1

2 APPARATUS AND METHOD FOR DETECTING, 
IDENTIFYING AND INCORPORATING ADVERTISEMENTS 

3 IN A VIDEO 

4 
5 

6 The present invention relates to apparatus 

7 and methods for superimposing a small video image into 

8 a larger video image. 
9 

10 
11 

12 International sports events or other 

13 spectacles generally draw the interest and attention of 

14 spectators in many countries. For example, the 

15 Olympics, Superbowl, World Cup, major basketball and 

16 soccer games, auto races etc. fit into this category. 

17 Such events are generally broadcast live by video to a

18 large international audience. The locale in which 

19 these events take place, such as stadiums or courts, 

20 provide advertising space all around in the form of 

21 signs, posters or other displays on fences and 

22 billboards, and in fact on any unoccupied space 

23 suitably located, including sections of the playing 

24 field. 

25 Due to the nature of the displays, which are 

26 mostly in the form of printed matter, they are not 

27 changed too frequently and remain at least for a day, 

28 or a series or a whole season, and are directed mostly 

29 at local audiences. In cases where two teams from 

30 different countries play each other, the advertisements 

31 are usually arranged so that one side of the stadium 

32 contains advertisements directed to audiences in one 

33 country, while the other side has advertisements 

34 directed to the spectators in the other country. 

35 The video cameras in these instances film the 

36 event from opposite sides of the stadium for their 

37 respective audiences. This of course is logistically 

38 complicated and limits the angle from which the events 
1 can be seen in either of the countries represented in

2 the game.

3 Another limitation to present methods of 

4 advertising is the stringent safety requirements for 

5 positioning the billboards, so as not to interfere with 

6 the game, nor disturb the view of the spectators in the 

7 stadium, nor pose a danger to the players. The 

8 displays must not be too close to the actual field of 

9 action, so as not to distract the players. 

10 A most serious drawback of the present system 

11 for advertising at major world sports events is the 

12 fact that although the event is televised live 

13 throughout the world, the actual physical 

14 advertisements in the stadium, because of their broad 

15 international exposure, can only cater to products 

16 having a world market.

17 Local advertisers can only make use of such 

18 world-class televised events by locally superimposing 

19 messages on the TV screen, or by interrupting the real 

20 time of the event. 

21 Another drawback of the existing system is 

22 that over long time periods, due to the scanning of the 

23 TV camera, the signs appear too blurred to be read by 

24 the TV viewers. On many other occasions, only part of 

25 the sign is visible to the TV viewers and the sign 

26 cannot be read. 

27 The following reference, the disclosure of 

28 which is incorporated herein by reference, describes 

29 Gaussian edge detection: 

30 J.F. Canny, "A computational approach to edge 

31 detection", IEEE Trans. Pattern Analysis and Machine 

32 Intelligence, Vol. 8, pp. 679-698, November, 1986.
33 

34 
35 
36 
37 

38
1

2 
3 

4 The present invention relates to a system and 

5 method for detecting, identifying and scaling in a 

6 video frame, suitable distinct targets and areas and 

7 inserting into these areas virtual images stored in the 

8 memory of the system, so that all objects or shadows in 

9 front of the distinct areas blocking portions thereof 

10 from view will be seen in a video transmission as being 

11 in front of and blocking the same portions of the areas 

12 containing virtual images. 

13 A particular feature of the invention is to 

14 operate the system in real time. The invention also 

15 provides apparatus for operating the system. The 

16 invention is particularly useful for advertising in 

17 sports courts. 

18 It is an object of the present invention to 

19 provide a system and method for video transmission of 

20 active events, for example sports events, having in the 

21 background physical images in designated targets, 

22 wherein the physical images are electronically 

23 exchanged with preselected virtual images, so that 

24 objects or shadows actually blocking portions of the 

25 physical images will be seen by viewers as blocking the 

26 same portions of the virtual images, and the motion of 

27 players or a ball blocking the physical image will 

28 block corresponding regions of the exchanged virtual 

29 image, so that the exchanged electronic image will 

30 remain in the background of the event, exactly as the 

31 original image. 

32 In a preferred embodiment of the present 

33 invention, the physical image to be substituted is 

34 detected, recognized, and located automatically and is 

35 replaced within one TV frame so that the original image 

36 is not perceptible to the TV viewers. In this 

37 embodiment no man is required in the loop during live

38 broadcasting. 



1 Since the same physical image may be captured

2 by a plurality of TV cameras deployed in various 

3 locations around the court, and each camera usually has 

4 a continuous zoom lens, the system is able to detect 

5 and identify a certain physical target in all possible 

6 spatial orientations and magnifications of the target. 

7 The system is also capable of unequivocally 

8 identifying the scale and perspective of the physical 

9 target and normalizing the implanted virtual image into 

10 the same perspective. 
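
The text does not state how the virtual image is normalized into the perspective of the physical target. One common approach, shown here only as a minimal sketch, is to estimate a planar projective transformation (homography) from the four corners of the stored image to the four detected corners of the target. The function names, the eight-parameter form with the last coefficient fixed at 1, and the example coordinates are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Return h0..h7 mapping four src corners onto four dst corners."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def warp_point(h: Sequence[float], p: Point) -> Point:
    """Map a point of the virtual image into frame coordinates."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)
```

Every pixel of the stored virtual image could then be mapped through warp_point (or, more usually, each target pixel mapped back through the inverse transformation) to obtain an image with the scale and perspective of the detected billboard.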

11 Another object of the invention is to provide 

12 a system and method of implanting in video 

13 transmission, virtual images in predetermined "free" 

14 background areas generally unsuitable for displaying 

15 physical signs, like the sports court itself.

16 In a preferred embodiment of the present 

17 invention, the task of detection and identification of 

18 these free areas is executed automatically.

19 A further object of the present invention is 
20 to automatically identify cases in which the physical

21 billboard appears blurred due to camera scanning or 

22 jitter and to replace the blurred sign with a clearer 

23 one or to alternatively apply the same blurring degree 

24 to the replacing sign so that it will have an 

25 appearance similar to its neighboring signs. 

26 Yet another object of the present invention 

27 is to automatically identify a case in which only a 

28 small portion of the billboard is visible in the 

29 camera's field of view and to replace this small

30 portion with the whole image of the original billboard. 

31 Still another object of the invention is to 

32 automatically identify cases in which the resolution of 

33 the captured billboard image is not sufficient for the 

34 TV viewers and to electronically replace them with 

35 larger virtual billboards so that their message may be 

36 conveniently captured by the viewers. 

37 Another object of the invention is to perform 

38 the implantation described above on a succession of 
1 video frames.

2 Yet another object of the invention is to 

3 provide the above system and method for electronic 

4 exchange or planting of virtual images in real time.

5 A further object of the invention is to 

6 provide a system and method for video broadcasting the 

7 same event to different populations of viewers in real 

8 time, with different electronic messages substituted in

9 the spaces occupied by physical displays. 

10 Still another object of the invention is to 

11 provide a system and method for utilization of 

12 available space in a stadium unused by physical 

13 displays for the purpose of advertising by planting 

14 therein electronic virtual images during real time 

15 broadcasting of an event taking place in a stadium. 

16 Still a further object of the invention is to 

17 provide apparatus for use in video transmission for 

18 exchanging physical images with virtual images or 

19 planting virtual images in unused background areas 

20 during an event in real time video transmission, 

21 without disturbing the actual transmission of the 

22 event. 

23 In accordance with a preferred embodiment of 

24 the present invention, there is provided a system and 

25 method for broadcasting active events being captured by 

26 a TV camera, wherein virtual images are electronically 

27 substituted in or superimposed on targets selected from 

28 physical displays and preselected background regions, 

29 including an electronic data bank of event locales and 

30 targets therein, a memory unit for storing digitized 

31 virtual images for substitution in the targets, 

32 apparatus for grabbing and digitizing video frames, 

33 apparatus for automatic target searching in digitized 

34 video frames and for detecting targets therein, 

35 apparatus for localization, verifying and identifying

36 the targets, apparatus for comparing the detected 

37 targets with corresponding targets in the data bank, 

38 apparatus for scaling and identifying the perspective 
1 of the original target and transforming the virtual

2 substitute image into the same scale and perspective, 

3 apparatus for real-time video tracking of a detected 

4 target throughout a succession of frames, and for the 

5 identification of target magnification (zoom) or 

6 changes in perspective, apparatus for distinguishing 

7 between non-background objects and shadows that block 
8 portions of the detected targets, apparatus for

9 electronically transferring the objects and shadows 

10 from the original video frame to the substituted frame, 

11 apparatus for inserting the electronically transformed 

12 virtual image into the video frame substituting the 

13 original image in the target without this 

14 transformation being perceptible by the viewers, 

15 apparatus for receiving and storing virtual images and 

16 generating a virtual images data bank, apparatus for 

17 generating a locale data bank either prior to or during an

18 event (a learning system) and video signal input-output 

19 apparatus. 

20 For this purpose the system uses a special 

21 method for the automatic detection and identification 

22 of targets using one or more of the following 

23 attributes: 

24 - geometry - such as the physical configuration

25 of billboards (rectangular shape or parallel lines 

26 attribute) as seen from different angles and 

27 magnifications, 

28 - texture of slogans and graphics - such as for 

29 example in posters, 

30 - character recognition, 

31 - field or court lines - which serve as

32 references for designating free court areas, 

33 - standard objects that have typical shape and 

34 texture - such as post, backboard, basket and/or a 

35 player's shirt, 

36 - colour, and 

37 - objects and shadows temporarily blocking 

38 portions of the image intended to be exchanged. 

1 The method clearly identifies the subject

2 target at any capturing angle and range and in any zoom 

3 state, and preferably in real time, so that the 

4 original billboard will not be perceptible to the TV 

5 viewers. The method typically identifies, in any 

6 frame, a relatively large number of targets (up to 20 

7 targets or more in an extreme case).

8 Blocking objects and shadows are 

9 distinguished from the background image by means of: 

10 comparing the detected target (partially blocked) 

11 with the same target stored in the system's data bank.

12 The smoothed and processed difference image between the

13 two is the image of hidden surfaces which forms a part 

14 of the blocking object. This procedure may be 

15 implemented also by using correlation windows and 

16 identifying a low value of the correlation coefficient 

17 as being due to occlusion, 

18 motion detection - to identify objects that move 

19 with respect to the background, 

20 texture and geometric shape - distinguishing a 

21 player, ball or shadow from a sign, slogan or graphic 

22 image etc., and 

23 colour - and shades of colour. 
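
The correlation-window variant mentioned above can be sketched as follows. This is an illustration only: the window size, the threshold of 0.5, the function names and the representation of images as lists of grey-level rows are assumptions, and the detected target is assumed to have been scaled already to the size of the stored image.

```python
from math import sqrt


def correlation(a, b):
    """Correlation coefficient of two equal-length lists of pixel values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A flat window carries no texture to correlate; treated here
        # (an assumption) as uncorrelated, hence as possibly occluded.
        return 0.0
    return cov / sqrt(var_a * var_b)


def occluded_windows(frame, icon, win=4, threshold=0.5):
    """Return (row, col) origins of windows that correlate poorly with the icon."""
    rows, cols = len(icon), len(icon[0])
    flagged = []
    for r in range(0, rows - win + 1, win):
        for c in range(0, cols - win + 1, win):
            f = [frame[r + i][c + j] for i in range(win) for j in range(win)]
            s = [icon[r + i][c + j] for i in range(win) for j in range(win)]
            if correlation(f, s) < threshold:
                flagged.append((r, c))
    return flagged
```

Windows that are returned mark regions where the detected target no longer resembles its stored image, which the text attributes to a blocking object or shadow.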

24 The electronic exchange is preferably instant 

25 and unnoticeable by the viewer since a perceptible 

26 exchange is usually unaccepted by the TV networks. 

27 Alternatively, it is possible to continuously "fade" 

28 the original image while enhancing the virtual image. 
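
A gradual exchange of this kind can be pictured as a weighted blend whose weight rises from frame to frame. The linear ramp and the function names below are assumptions chosen for illustration; the text does not prescribe a particular fading law.

```python
def fade_weights(n_frames):
    """Weights of the virtual image, rising linearly from 0 to 1."""
    if n_frames < 2:
        raise ValueError("a fade needs at least two frames")
    return [k / (n_frames - 1) for k in range(n_frames)]


def blend(original, virtual, alpha):
    """Blend two equally sized grey-level images with weight alpha."""
    return [[(1 - alpha) * o + alpha * v for o, v in zip(row_o, row_v)]
            for row_o, row_v in zip(original, virtual)]
```

With alpha equal to 0 the viewer sees only the original image, and with alpha equal to 1 only the virtual image.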

29 False identification of targets and images is 

30 preferably avoided. 

31 The substituted target should be localized to 

32 sub-pixel accuracy so that the replacing target be 

33 spatially fixed with respect to the frame during the 

34 whole succession of TV frames in which the target is 

35 inside the camera's field of view. This accuracy is needed

36 because the human eye is sensitive to sub-pixel

37 motions.
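
The text does not say how sub-pixel accuracy is obtained. One widely used technique, given here as a sketch under that assumption, fits a parabola through the best integer-position matching score and its two neighbours and takes the vertex as the refined position. The function name and the one-dimensional formulation are illustrative.

```python
def subpixel_peak(scores):
    """Return the fractional index of the maximum of a sampled score curve."""
    k = max(range(len(scores)), key=scores.__getitem__)
    if k == 0 or k == len(scores) - 1:
        return float(k)  # no neighbour on one side, so no refinement
    left, mid, right = scores[k - 1], scores[k], scores[k + 1]
    denom = left - 2 * mid + right
    if denom == 0:
        return float(k)  # the three samples are collinear
    # Vertex of the parabola through (-1, left), (0, mid), (1, right)
    return k + 0.5 * (left - right) / denom
```

Applied separately along the horizontal and vertical directions of a correlation surface, the same formula gives a two-dimensional sub-pixel estimate.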
1 The methods preferably employ special

2 parallel and pipelined processing hardware which will 

3 allow carrying out simultaneously the large number of 

4 operations involved in this process.

5 The method of this invention preferably uses 

6 two optional sub-systems: 

7 a) Digital Image Converter and Storage Unit 

8 consisting of an electro-optical scanner for digital 

9 conversion and storage of virtual images, for 

10 constructing a memory unit for images such as 

11 advertisements. The system may also have the 

12 possibility of inputting images such as advertisements 

13 in other ways, as by digital interface (magnetic, 

14 optical disc, communication link) or video port, and 

15 may further include a graphics programme and man- 

16 machine interface for designing virtual images (like 

17 slogans) "on-the-spot".

18 b) Locale "learning" and storage system - for 

19 creating a data bank of targets and fixed objects in 

20 locales such as stadiums and fields, including: signs 

21 (location, shape, colour and type - one-time, seasonal, 

22 etc.), court markers (lines, colour, goal/basket,

23 post), etc.

24 These two sub-systems can operate off-line or 

25 can be part of the basic system. The system can 

26 "learn" the details of the court in the course of a 

27 live event and create/update its data bank for future 

28 use. This can also be done using the trial shots taken 

29 before the event starts. 

30 The method involves the following steps: 

31 When the live or previously recorded video 

32 film is being transmitted, the following steps take 

33 place: 

34 1) Frame grabbing and digitization - each 

35 video frame is grabbed and each pixel value is 

36 digitized and stored in system memory, 

37 2) Searching - the captured video frame

38 is scanned to detect either actual physical displays 
1 (like the icons stored in the memory) or background

2 regions suitable for implantation whose specifications 

3 have been pre-defined. After detection, suspected

4 targets, i.e. displays, are checked for unequivocal 

5 identification. This is accomplished by identification 

6 of messages and graphics in the displays, or of colour 

7 and texture attributes using standard pattern 

8 recognition techniques like edge correlation and region 

9 matching methods, character recognition, neural

10 network techniques and so on. After the target 

11 (display) has been identified and accurately localized, 

12 its optical magnification and perspective are computed 

13 and the locations of all other stored targets 

14 (displays) in the frame are consecutively predicted 

15 using the locale's lay-out in the data bank, giving the 

16 system positive search clues for additional targets in 

17 the same video frame. 
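
The prediction of further target locations from one identified target can be sketched as below. The sketch makes a strong simplifying assumption that is not in the text: the targets lie in one plane viewed roughly head-on, so that a single magnification and a translation relate lay-out coordinates to frame coordinates. The lay-out dictionary, the identifiers and the function name are hypothetical.

```python
def predict_targets(layout, known_id, known_frame_xy, magnification):
    """Predict frame positions of the other targets in the locale lay-out.

    layout maps a target identifier to its (x, y) lay-out position;
    known_frame_xy is where the identified target was found in the frame.
    """
    kx, ky = layout[known_id]
    fx, fy = known_frame_xy
    return {
        tid: (fx + magnification * (x - kx), fy + magnification * (y - ky))
        for tid, (x, y) in layout.items()
        if tid != known_id
    }
```

Each predicted position would serve only as a search clue; the target found there would still be verified as described above.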

18 3) Blocked surface identification - when a 

19 given message area or display region is positively 

20 identified in a frame, the target (display) is compared 

21 with its properly scaled stored image (icon) and those 

22 areas of the display that are temporarily blocked by an 

23 object such as by the body of a player, by a ball or a 

24 shadow etc. are revealed after proper smoothing and 

25 processing of the results. The pixel addresses of these 

26 surfaces are stored so that these surfaces will later 

27 be superimposed on the substituted image. 

28 4) Scaling, perspective transformation and 

29 substitution - when a physical image display or a 

30 free location is identified and localized, the memory 

31 of the system is searched to find the desired virtual 

32 image to be substituted or implanted. The exchanged 

33 virtual image (patch) is then first normalized to 

34 acquire the proper size and perspective of the original 

35 physical image and identified blocked surfaces are then

36 removed, so that the exchanged image looks like a 

37 background display, or as a painted display on the 

38 court. 
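
Steps 3) and 4) can be illustrated together in a few lines. The sketch assumes grey-level images of equal size that have already been brought to a common scale and perspective, uses a plain absolute-difference threshold whose value is hypothetical, and omits the smoothing of the difference image that the text calls for.

```python
def blocked_mask(target, icon, threshold=30):
    """Mark pixels where the detected target differs from its stored icon."""
    return [[abs(t - i) > threshold for t, i in zip(row_t, row_i)]
            for row_t, row_i in zip(target, icon)]


def substitute(target, icon, virtual, threshold=30):
    """Insert the virtual image while keeping blocking objects in front."""
    mask = blocked_mask(target, icon, threshold)
    return [[t if blocked else v
             for t, blocked, v in zip(row_t, row_m, row_v)]
            for row_t, row_m, row_v in zip(target, mask, virtual)]
```

Pixels flagged as blocked keep their value from the original frame, so a player or ball passing in front of the billboard still appears in front of the substituted image.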
1 5) Real-time video tracking - typically a

2 given display is visible for a few seconds before it

3 moves out of the camera's field of view. The system 

4 preferably uses previous frames' information to track a 

5 given display throughout this succession of frames. To 

6 do that, conventional video tracking techniques, such 

7 as edge, centroid or correlation tracking methods, are 

8 executed. These methods should incorporate subpixel 

9 accuracy estimates. Tracking of players or of the ball 
10 can also be instrumental to identify blocking portions

11 in the case where target icons are not stored in the 

12 system memory or for implantation in free regions. 

13 There is thus provided, in accordance with a 

14 preferred embodiment of the present invention, 

15 apparatus for advertisement incorporation including a 

16 field grabber operative to grab and digitize at least 

17 one field representing at least a portion of a sports 

18 facility, and an advertisement incorporator operative 

19 to incorporate, into at least one field, an 

20 advertisement whose contents varies over time.

21 Further in accordance with a preferred 

22 embodiment of the present invention, the advertisement 

23 incorporator includes an advertisement site detector 

24 operative to detect at least one advertisement site in 

25 at least one field on a basis other than location of 

26 the advertisement site relative to the sports facility. 

27 Still further in accordance with a preferred 

28 embodiment of the present invention, the advertisement 

29 incorporator is operative to incorporate an

30 advertisement into at least one field at a partially

31 occluded advertisement site within the sports facility.

32 Still further in accordance with a preferred

33 embodiment of the present invention, the contents of

34 the advertisement varies in accordance with a

35 predetermined schedule.

36 Additionally in accordance with a preferred 

37 embodiment of the present invention, the contents of

38 the advertisement varies in accordance with an external

1 input.

2 Further in accordance with a preferred 

3 embodiment of the present invention, the advertisement 

4 incorporator also includes an audience noise evaluator 

5 operative to detect and evaluate a level of noise 

6 generated by an audience and to provide a noise level 

7 input to the advertisement incorporator and wherein the 

8 contents of the advertisement varies in accordance with 

9 the noise level input. 
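
How the noise level input drives the choice of advertisement is not detailed in the text. A minimal sketch, assuming a fixed set of ascending noise thresholds and one advertisement per resulting band, might look as follows; the threshold values and advertisement names in the usage are hypothetical.

```python
from bisect import bisect_right


def select_advertisement(levels, ads, noise_level):
    """Pick the advertisement for the band that contains noise_level.

    levels is an ascending list of thresholds; ads must hold one more
    entry than levels, one for each band between thresholds.
    """
    if len(ads) != len(levels) + 1:
        raise ValueError("need exactly one advertisement per band")
    return ads[bisect_right(levels, noise_level)]
```

For example, with thresholds [60.0, 85.0] and advertisements ["calm", "lively", "roaring"], a measured level of 90.0 selects "roaring". A predetermined schedule could be handled the same way by passing elapsed time instead of a noise level.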

10 There is additionally provided, in accordance 

11 with a preferred embodiment of the present invention, 

12 a method for advertisement incorporation including 

13 grabbing and digitizing at least one field representing 

14 at least a portion of a sports facility, and 

15 incorporating into at least one field, an advertisement 

16 whose contents varies over time. 
1
2 

3 The present invention will be understood and 

4 appreciated more fully from the following detailed 

5 description, taken in conjunction with the drawings and 

6 appendices in which: 
7 

8 Fig. 1 is a logical flow diagram of the 

9 processes and tasks required in accordance with a 

10 preferred embodiment of the method of the present 

11 invention; 

12 Fig. 2 is a block diagram of the basic and

13 sub-system modules in accordance with a preferred 

14 embodiment of the present invention; 

15 Fig. 3 is a block diagram of a basic 

16 processing unit; 

17 Fig. 4 illustrates a minimum basic on-line 

18 system in accordance with a preferred embodiment of the 

19 present invention; 

20 Fig. 5 illustrates a minimum basic off-line 

21 system in accordance with the invention; 

22 Fig. 6 illustrates a system in accordance 

23 with a preferred embodiment of the present invention 

24 adapted for cable TV application; 

25 Fig. 7 is a simplified block diagram of a 

26 real time system for advertisement site detection and 

27 advertisement incorporation, constructed and operative 

28 in accordance with a preferred embodiment of the 

29 present invention; 

30 Fig. 8 is a simplified block diagram of the

31 parallel processor and controller of Fig. 7;

32 Fig. 9 is a simplified block diagram of an

33 alternative embodiment of a real time system for

34 advertisement site detection and advertisement 

35 incorporation; 

36 Fig. 10A is a simplified flowchart of a

37 preferred method of operation of the parallel processor 

38 and controller of Fig. 7, when only a single 
1 advertisement site is to be identified and only a

2 single advertisement is to be incorporated at that 

3 site; 

4 Fig. 10B is a simplified flowchart of a

5 preferred method of operation of the parallel processor

6 and controller of Fig. 7, when a plurality of

7 advertisement sites is to be identified and a 

8 corresponding plurality of advertisements, which may or 

9 may not differ in content, is to be incorporated at 

10 those sites; 

11 Fig. 11 is a simplified flowchart of a 

12 preferred method for performing the segmentation step 

13 of Figs. 10A and 10B;

14 Fig. 12 is a simplified flowchart of a 

15 preferred model matching method for performing the 

16 advertisement content identification step of Figs. 10A

17 and 10B;

18 Fig. 13 is a simplified flowchart of a 

19 preferred method for performing the localization step 

20 of Figs. 10A and 10B;

21 Fig. 14 is a simplified flowchart of a 

22 preferred method for performing the tracking step of 

23 Figs. 10A and 10B;

24 Fig. 15 is a simplified flowchart of a 

25 preferred method for performing the occlusion analysis 

26 step of Figs. 10A and 10B;

27 Fig. 16 is a simplified flowchart of a 

28 preferred method for performing the advertisement 

29 incorporation step of Figs. 10A and 10B;

30 Fig. 17 is a simplified block diagram of 

31 camera monitoring apparatus useful in conjunction with 

32 the advertisement site detection/incorporation 

33 apparatus of Fig. 7; 

34 Fig. 18 is a simplified flowchart of a 

35 preferred method for processing the output of the 

36 occlusion analysis process of Fig. 15 in order to take

37 into account images from at least one off-air camera; 
38 Fig. 19 is a simplified flowchart of a

1 preferred method for detecting and tracking moving

2 objects of central interest; and 

3 Appendix A is a computer listing of a 

4 software implemented non-real time system for 

5 advertisement site detection and advertisement 

6 incorporation, constructed and operative in accordance 

7 with an alternative embodiment of the present 

8 invention. 
9 

10 
11 
13
14
17
18
1
2 
3 

4 Referring now to Fig. 1, in a preferred 

5 embodiment of the present invention, the system and 

6 method are designed to automatically perform the 

7 substitution of physical targets with synthetic images 

8 in real time, although a simpler version of the 

9 invention can be used off-line. 

10 When operating the system, the modules 

11 required are illustrated in the block diagram of Fig.

12 2. These include: 

13 a basic processing unit; 

14 an optional scanner/digitizer used to create the 

15 data bank of synthetic images from still pictures; and 

16 an optional sub-system composed of a TV camera, 

17 digitizer and memory to create the stadium data bank. 

18 As was mentioned before, there may be other methods to 

19 create the data bank of synthetic images. The locale's 

20 (stadium's) data bank may also be created from the 

21 trial shots taken before the game starts or even be 

22 incrementally built in the course of the game by means 

23 of a "learning" process or by using data supplied by 

24 the stadium owner, the advertiser or the TV network. 

25 Fig. 2 illustrates a block diagram of the 

26 apparatus used in the system, wherein 1, 2, n are a 

27 plurality of TV cameras in different positions, which 

28 are the usual TV network cameras, 3 is the basic 

29 processing unit described in Fig. 3, sub-system 4 

30 converts and stores synthetic images and sub-system 5 

31 is a "learning" and storage system for event locales 

32 and targets therein. The output 6 can be transmitted 

33 by cable, optical fiber or wirelessly. It can also be 

34 displayed and/or recorded. 

35 The basic processing unit required to operate 

36 the system in real-time is shown in Fig. 3. This 

37 module comprises: 

38 a frame grabber for colour image acquisition; 

1 a plurality of image memories;

2 a fast parallel processor; 

3 a program memory; 

4 data banks of synthetic images to be substituted 

5 and of locale's lay-outs and target icons; 

6 a man/machine interface for control and for local 

7 display and recording; and 

8 an image digital to analog converter.

9 The above apparatus is used to automatically 

10 locate in real time in each video frame, suitable areas 

11 within a stadium which have physical displays or might 

12 be suitable for embodying such displays, and to 

13 substitute for such physical displays, or introduce 

14 into such areas, virtual images which are stored in the 

15 memory of the system to serve as advertisements in the 

16 background. 

17 These electronic inserted images will be seen 

18 by viewers as if they are physical displays located in 

19 a stadium and all action taking place in front of the 

20 actual physical display will appear to the viewer to be 

21 taking place in front of the virtual image as well.

22 Fig. 4 illustrates an on-line system in 

23 accordance with an aspect of this invention consisting 

24 of a video camera 10, video processing unit 12 and 

25 work station 14 that provides the required man/machine 

26 interface. 

27 Fig. 5 illustrates a basic off-line system in 

28 accordance with one aspect of this invention. In this 

29 case, a video tape 20, a video cassette recorder or a 

30 video disk is the input rather than a TV camera and 

31 this is processed by the processing unit 22 and work 

32 station 24 to provide a video tape output 26 with 

33 substituted images. 

34 Fig. 6 illustrates yet another application of

35 the system of this invention, namely a cable TV center. 

36 The center 30 receives transmissions from stations 32 

37 and 34. These transmissions are processed by the 

38 processing unit 22 and work station 24 and broadcast 



WO 95/10919

PCT/US94/01679

1 with substituted advertisements to subscribers from the

2 center 30.

3 Although a preferred system according to this 

4 invention superimposes blocking objects and shadows on 

5 the virtual images, a less sophisticated and much 

6 cheaper system is also intended as part of this 

7 invention, and that is a system where virtual images 

8 are exchanged for physical without relating to blocking 

9 objects. 

10 Such a system can be quite useful for 

11 substituting images in unblocked regions, for example 

12 high up in a stadium. 

13 Although a preferred embodiment of the 

14 present invention automatically detects and recognizes 

15 a given billboard in each TV frame, a less 

16 sophisticated system is also intended as part of this 

17 invention. In such a less sophisticated system the 

18 selection of a given sign to be substituted is done 

19 "manually" by a pointer such as a light pen or a cursor 

20 (operated by a mouse) with a human operator in the 

21 loop. 

22 This system is mainly off-line. When it is 

23 used on-line in real time it will be very difficult for 

24 the operator to perform the pointing task since in a 

25 typical scenario the sign is continuously visible for

26 only short periods of a few seconds each. 

27 In such a mode of operation the replacement 

28 will nevertheless be perceptible to the TV viewers. 

29 This annoys the spectators and in many cases is not 

30 permitted by the TV networks. 

31 From the above description of the invention, 

32 it is apparent that the system, method and apparatus 

33 described above can have many applications. Thus, it 

34 is also possible to introduce virtual images, such as 

35 slogans or graphic advertisement, on the uniforms of 

36 players, particularly when a player is shown in close- 

37 up. In such a case, the outline of the player, or at 

38 least his shirt or helmet, would be the target for 
1 implanting a virtual image.

2 Another possible application is the automatic 

3 generation of continuous video films showing only 

4 sequences wherein specific targets, which have been 

5 pre-selected, appear to the exclusion of sequences 

6 where these targets do not appear. Such video films 

7 can be useful for analyzing and monitoring the activity 

8 of specific targets, for example individual players and 

9 their performance throughout an entire team game. This 

10 enables tracking each individual throughout an entire 

11 game without having to replay the entire cassette for 

12 each player. 
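
Selecting only the sequences in which a pre-selected target appears amounts to finding runs of consecutive frames flagged as containing it. The sketch below assumes that a per-frame visibility flag is already available from the detection stage; the function name and the use of frame indices are illustrative.

```python
def target_sequences(visible):
    """Return (first_frame, last_frame) pairs for each run of visible frames."""
    runs = []
    start = None
    for index, seen in enumerate(visible):
        if seen and start is None:
            start = index                      # a new run begins
        elif not seen and start is not None:
            runs.append((start, index - 1))    # the current run ends
            start = None
    if start is not None:
        runs.append((start, len(visible) - 1))  # run reaches the last frame
    return runs
```

Concatenating the frames of the returned runs would give the continuous film described above.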

13 Another application of this invention is to 

14 generate statistical data of targets such as 

15 advertisements, for example the number of times and 

16 accumulated period that an advertisement appears on 

17 the screen, and to debit accordingly.
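
Such statistics can be accumulated from the same per-frame visibility information. In the sketch below the frame rate, the price per second and the function name are assumptions introduced for illustration; the text does not specify a charging rule.

```python
def exposure_report(visible, frame_rate=25.0, price_per_second=1.0):
    """Return (appearances, seconds_on_screen, charge) for one advertisement."""
    appearances = 0
    frames_on_screen = 0
    previous = False
    for seen in visible:
        if seen:
            frames_on_screen += 1
            if not previous:
                appearances += 1   # count each new uninterrupted appearance
        previous = seen
    seconds = frames_on_screen / frame_rate
    return appearances, seconds, seconds * price_per_second
```

An advertisement seen for 50 frames, hidden for 10 and seen again for 25 frames at 25 frames per second would thus be reported as two appearances totalling three seconds.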

18 The implanted image can be in the form of a 

19 fixed, blinking or scrolling image, or it may be an 

20 animated film or video clip. 

21 Fig. 7 is a simplified block diagram of a

22 real time system for advertisement site detection and 

23 advertisement incorporation, constructed and operative 

24 in accordance with a preferred embodiment of the 

25 present invention. 

26 The apparatus of Fig. 7 includes a video 

27 input source 100, such as a video camera, video 

28 cassette, broadcast, video disk, or cable transmission, 

29 which is connected, via a suitable connector, with a 

30 field grabber 110, preferably, or alternatively with a 

31 frame grabber. Henceforth, use of the term "field 

32 grabber" is intended to include frame grabbers. 

33 The field grabber 110 provides grabbed and 

34 digitized fields to a parallel processor and controller 

35 120, described in more detail below with reference to 

36 Fig. 8, which is preferably associated with a video 

37 display 130 which provides an interactive indication to

38 a user of advertisement site detection and

1 advertisement incorporation operations of the system.

2 Preferably a light pen 140 is associated with the video 

3 display 130. 

4 According to an alternative embodiment of the 

5 present invention, the system receives an indication 

6 from a user of the presence in the field of view of one 

7 or more advertisements to be replaced and of the 

8 location/s thereof. The user input may, for example, be 

9 provided by means of a light pen 140. The indication 

10 provided by the user may comprise a single indication 

11 of an interior location of the advertisement, such as 

12 the approximate center of the advertisement or may 

13 comprise two or four indications of two opposite 

14 vertices or all four vertices, respectively, of an 

15 advertisement to be replaced. 

16 Optionally, the user also provides an 

17 indication of the contents of the advertisement. For 

18 example, a menu of captions identifying advertisements 

19 to be replaced, may be provided on the video display 

20 130 adjacent or overlaying a display of the playing 

21 field and the user can employ the light pen to identify 

22 the appropriate caption. 

23 An advertisement image and advertisement

24 arrangement database 150 is provided which may be 

25 stored in any suitable type of memory such as computer 

26 memory or secondary memory, such as a hard disk. The

27 advertisement image and arrangement database 150 

28 typically stores a plurality of advertisement images, 

29 typically still images, including images to be replaced 

30 and/or images to be incorporated into the image of the 

31 playing field, either replacing an existing 

32 advertisement or in a location not presently occupied 

33 by an advertisement. 

34 The database 150 may also include an 

35 indication of the arrangement . of a plurality of 

36 advertisements to be replaced, if the arrangement is 

37 known ahead of time. Typically, the indication of the 

38 arrangement does not include an indication of the 



WO 95/10919



PCT/US94/01679



1 location of each advertisement relative to the playing 

2 field, but instead includes an indication of the order 

3 in which the advertisements to be replaced will be 

4 arranged in the field. For example, a sequence of 20 

5 side-by-side advertisements may be arranged around 

6 three sides of a playing field. The database 150 may 

7 then include an indication of the sequence in which the 

8 advertisements are arranged. 

9 Advertisement images in the database 150 may

10 be provided by field grabber 110 or from any suitable 

11 advertisement image source 160, such as but not limited 

12 to an image generating unit such as an image processing

13 workstation, a scanner or other color reading device, 

14 any type of storage device, such as a hard disk, a CD 

15 ROM drive, or a communication link to any of the

16 above. 

17 The video output of the system may be 

18 provided via a suitable connector to suitable equipment

19 for providing wireless or cable transmission to 

20 viewers. 

21 Fig. 8 is a simplified block diagram of the 

22 parallel processor and controller 120 of Fig. 7. The 

23 parallel processor/controller 120 preferably includes 

24 an advertisement site detection/content identification 

25 unit 170, a plurality of parallel tracking modules 180,

26 an occlusion analysis and advertisement incorporation

27 unit 190, a video encoder 200 and a controller 210.

28 The advertisement site detection/content 

29 identification unit 170 of Fig. 8 may be implemented 

30 based on a suitable plurality of suitable image 

31 processing boards, such as Ariel Hydra boards, 

32 commercially available from Ariel, USA. Each of these 

33 preferably incorporates four TMS320C40 digital signal 

34 processors, a DRAM of 64 MB, an SRAM of 1 MB, and a VME 

35 bus interface. A specially designed coprocessor is 

36 preferably added to these boards to perform the 

37 segmentation task. The image processing boards are 

38 programmed based on the advertisement site detection 


1 and content identification methods of Figs. 11 and 12

2 on which Appendix A is based in part. For example, the 

3 appropriate portions of the listing of Appendix A may 

4 be converted into Assembler and the resulting code may 

5 be loaded into the digital signal processor of the 

6 image processing board. 

7 Each of parallel tracking modules 180 may be 

8 implemented based on one or more image processing 

9 boards, such as Ariel Hydra boards, commercially 

10 available from Ariel, USA. Each of these preferably 

11 incorporates four TMS320C40 digital signal processors, 

12 a DRAM of 64 MB, an SRAM of 1 MB, and a VME bus 

13 interface. The image processing boards are programmed 

14 for parallel operation based on the tracking method of 

15 Fig. 14 on which Appendix A is based in part. For 

16 example, the appropriate portions of the listing of 

17 Appendix A may be converted into Assembler and the 

18 resulting code may be loaded into the digital signal 

19 processor of the image processing board. 

20 The occlusion analysis and advertisement 

21 incorporation unit 190 may also be based on one or more 

22 texture mapping boards such as the Fairchild's Thru-D 

23 boards with the appropriate bus bridges, programmed 

24 based on the occlusion analysis and advertisement 

25 incorporation methods of Figs. 15 and 16 on which 

26 Appendix A is based in part. For example, the 

27 appropriate portions of the listing of Appendix A may 

28 be converted into Assembler and the resulting code may 

29 be loaded into the processor of the texture mapping 

30 board. 

31 Video encoder 200 is operative to perform D/A 

32 conversion. 

33 Controller 210 may, for example, comprise a 

34 486 PC programmed based on the control method of Figs.

35 10A - 10B on which Appendix A is based in part. For

36 example, the appropriate portions of the listing of

37 Appendix A may be executed on an Intel 486 PC processor.

38 Fig. 9 is a simplified block diagram of an 
1 alternative embodiment of a real time system for

2 advertisement site detection and advertisement 

3 incorporation. In the apparatus of Fig. 9, a

4 conventional workstation 212, having its own video 

5 display 220 and its own field grabber (not shown), such 

6 as a Silicon Graphics Onyx workstation loaded with a 

7 video board and suitable software, replaces the

8 following units of Fig. 7: field grabber 110, the 

9 parallel processor and controller 120 other than the 

10 advertisement site detection and content identification 

11 unit 170 and tracking modules 180 thereof, the video 

12 display, and the database 150. 

13 The software for the workstation may be based 

14 on the Appendix A implementation of the method of Figs. 

15 10A - 10B, suitably converted into the workstation's

16 environment; however, some of the functions of Appendix

17 A are preferably omitted. Specifically: 

18 a. the advertisement site detection and 

19 tracking functions, corresponding to the segmentation, 

20 advertisement content identification and tracking steps 

21 320, 330 and 310 respectively of Figs. 10A - 10B are

22 omitted and are instead implemented in real time by 

23 dedicated hardware 230 in Fig. 9; and 

24 b. The texture mapping functions (second and 

25 third steps of Fig. 16) which preferably form part of 

26 the advertisement incorporation function, are 

27 preferably omitted and are, instead, performed by the 

28 texture mapping functions provided by the workstation 

29 itself. 

30 The dedicated hardware 230 of Fig. 9 may be 

31 similar to the advertisement site detection/content 

32 identification unit 170 and parallel tracking modules

33 180 of Fig. 8. 

34 Appendix A is a computer listing of a non- 
35 real time software implementation of the present 

36 invention which is operative, for example, on a 486 PC 

37 in conjunction with a conventional frame grabber such 

38 as an Imaging MFG board. The method of Appendix A is 
1 now described with reference to Figs. 10A - 16.

2 Fig. 10A is a simplified flowchart of a

3 preferred method of operation of the parallel processor 

4 and controller 120 of Fig. 7, when only a single 

5 advertisement site is to be identified and only a 

6 single advertisement image is to be incorporated at 

7 that site. 

8 Fig. 10B is a simplified flowchart of a

9 preferred method of operation of the parallel processor

10 and controller 120 of Fig. 7, when a plurality of 

11 advertisement sites is to be identified and a 

12 corresponding plurality of advertisement images, which 

13 may or may not differ in content, is to be incorporated 

14 at those sites respectively. 

15 The method of Fig. 10B typically includes the

16 following steps, which are similar to the steps of Fig.

17 10A which are therefore not described separately for

18 brevity: 

19 STEP 290: A digitized video field is

20 received from the field grabber 110 of Fig. 7.

21 STEP 300: A decision is made as to whether or 

22 not at least one advertisement in the current field was 

23 also present in the previous field (and televised by 

24 the same camera). If so, the current field is termed a

25 "consecutive" field and the segmentation, content 

26 identification and localization steps 320, 330 and 340 

27 preferably are replaced only by a tracking step 310. If 

28 not, the current field is termed a "new" field. 

29 If the field is a "consecutive" field, the 

30 plurality of advertisements is tracked (step 310), 

31 based on at least one advertisement which was present 

32 in a previous field, since the present field is a 

33 "consecutive" field. 

34 If the field is a "new" field, the 

35 advertisement site at which an advertisement is to be 

36 incorporated is identified in steps 320, 330 and 340. A 

37 loop is performed for each advertisement from among the 

38 plurality of advertisements to be processed. 


1 Preferably, the segmentation and content identification 

2 steps 320 and 330 are performed only for the first 

3 advertisement processed. 

4 In step 320, a pair of generally parallel 

5 lines is typically detected and the image of the field 

6 is segmented. Specifically, the portion of the field 

7 located within the two detected parallel lines, which 

8 typically correspond to the top and bottom boundaries 

9 of a sequence of advertisements, is segmented from the 

10 remaining portion of the field. 

11 Typically, the segmentation step 320 is 

12 operative to segment advertisements regardless of: 

13 their perspective relative to the imaging camera, the 

14 zoom state of the imaging camera lens, the location of 

15 the advertisement in the field of view (video field),

16 the angular orientation of the imaging camera relative 

17 to the ground and the location of the TV camera. 

18 The segmentation step 320 is typically 

19 operative to identify an empty or occupied 

20 advertisement site on a basis other than location, such 

21 as but not limited to any of the following, separately 

22 or in any combination: 

23 a. Geometrical attributes of the advertisement's 

24 boundary such as substantially parallel top and bottom 

25 boundaries or such as four vertices arranged in a 

26 substantially rectangular configuration; 

27 b. A color or a combination of colors or a color 

28 pattern, which are known in advance to be present in 

29 the advertisement image. 

30 c. The spatial frequencies band of the 

31 advertisement image, which is typically known in 

32 advance. Typically, the known spatial frequencies band 

33 is normalized by the height of the advertisement which 

34 may, for example, be derived by computing the distance 

35 between a pair of detected horizontal lines which are 

36 known to be the top and bottom boundaries of the 

37 advertisement sequence.

38 In step 330, the content of the portion

1 between the two substantially parallel lines is matched

2 to a stored representation of an advertisement to be 

3 replaced. 

4 Steps 320 and 330 allow advertisement sites 

5 to be identified and the content thereof to be matched 

6 to a stored model thereof, even if cuts (transitions, 

7 typically abrupt, between the outputs of a plurality of 

8 cameras which are simultaneously imaging the sports 

9 event) occur during the sports event. Typically, at 

10 each cut, steps 320 and 330 are performed so as to 

11 identify the advertisement within the first few fields 

12 of the cut. Until the next cut occurs, the identified 

13 advertisement is typically tracked (step 310). 

14 In step 340, the advertisement is localized 

15 at subpixel accuracy. 

16 Finally, for each advertisement, occlusion 

17 analysis is performed (step 350) and the replacing 

18 advertisement is incorporated in the advertisement site 

19 (step 360). Alternatively, the occlusion analysis and 

20 advertisement incorporation steps are replaced by an 

21 advertisement enhancement step in which the existing 

22 advertisement is enhanced, using conventional edge 

23 sharpening techniques, rather than being replaced. 
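
The per-field control flow described above for Fig. 10B may be sketched as follows. This is an illustrative outline only and is not taken from Appendix A. The function name `process_field`, the `state` dictionary and the keys of the `steps` dictionary are hypothetical, and the test for a "consecutive" field is reduced to checking whether any advertisement sites were carried over from the previous field.

```python
def process_field(field, state, steps):
    """Run one video field through the steps of Fig. 10B.

    steps: dict of callables (hypothetical names) standing in for the
           numbered steps; state: dict carried from field to field.
    """
    if state.get("sites"):
        # Step 300 decided "consecutive": only tracking (step 310) is needed.
        state["sites"] = steps["track"](field, state["sites"])
    else:
        # "New" field: segmentation, identification, localization.
        strip = steps["segment"](field)                  # step 320
        ads = steps["identify"](field, strip)            # step 330
        state["sites"] = steps["localize"](field, ads)   # step 340
    for site in state["sites"]:
        occlusion = steps["occlusion"](field, site)            # step 350
        field = steps["incorporate"](field, site, occlusion)   # step 360
    return field
```

After a cut, a caller would clear `state["sites"]` so that the next field is again treated as a "new" field.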

24 Optionally, a fee accumulation step 362 is 

25 performed, typically after occlusion analysis step 350. 

26 In the fee accumulation step, a fee for each 

27 advertisement is accumulated. The fee may be computed 

28 on any suitable basis. For example, the fee may be 

29 determined by counting the total amount of time for 

30 which the advertisement was displayed and for which at 

31 least 50% of the advertisement was unoccluded, and 

32 multiplying by a fixed dollar rate per time unit. 

33 Alternatively, the proportion of the unoccluded area of 

34 the advertisement may be computed for each time 

35 interval, such as each second. Optionally, the display 

36 time or the sum over time of the displayed area may be 

37 adjusted to take into account the game's progress. For 

38 example, the display time or the sum over time of the 
1 displayed area may be multiplied by an externally

2 provided index indicating the tension level of the game 

3 during display of the advertisement. High tension level 

4 may, for example, mean that the game has gone into 

5 overtime or that a significant event, such as a goal, 

6 has occurred during display or just before display. 

7 Alternatively, the tension level index may be provided 

8 by the system itself. For example, a voice recognition

9 unit may recognize significant words uttered by the 

10 sports commentator, such as the word "goal". 
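
The fee accumulation of step 362 may be illustrated by the following minimal sketch. It assumes that each field in which the advertisement is displayed is summarized by its unoccluded fraction and by a tension index. The function name, the record layout, the field duration and the rate are assumptions made for the example and do not appear in Appendix A.

```python
def accumulate_fee(fields, field_duration_s, rate_per_second,
                   min_unoccluded=0.5):
    """Accumulate the fee for one advertisement.

    fields: iterable of (unoccluded_fraction, tension_index) pairs, one per
            field in which the advertisement was displayed (assumed layout).
    """
    billable_seconds = 0.0
    for unoccluded_fraction, tension_index in fields:
        # Count a field only if at least half of the advertisement is visible.
        if unoccluded_fraction >= min_unoccluded:
            # Weight the display time by the tension level of the game.
            billable_seconds += field_duration_s * tension_index
    return billable_seconds * rate_per_second
```

A tension index of 1.0 for every field reduces this to the plain case of display time multiplied by a fixed rate.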

11 According to an alternative embodiment of the 

12 present invention, the segmentation and advertisement 

13 content identification steps 320 and 330 respectively 

14 may be omitted if physical landmarks identifying the 

15 locations of advertisements to be replaced whose 

16 contents are known in advance, are positioned and

17 captured ahead of time in the playing field.

18 Fig. 11 is a simplified flowchart of a 

19 preferred method for performing the segmentation step 

20 320 of Figs. 10A and 10B.

21 The method of Fig. 11 preferably includes the 

22 following steps: 

23 STEP 380: A new field is received and the 

24 resolution thereof is preferably reduced since the 

25 following steps may be performed adequately at a lower

26 resolution. For example, a low pass filter may be

27 employed to reduce a 750 x 500 pixel field to 128 x 128 

28 pixels. 

29 STEP 390: Optionally, the low resolution 

30 image is smoothed, e.g. by median filtering or low pass 

31 filtering, so as to remove information irrelevant to 

32 the task of searching for long or substantially

33 horizontal lines. 

34 STEP 400: Edges and lines (two-sided edges) 

35 are detected, using any suitable edge detection method 

36 such as the Canny method, described by J.F. Canny in "A 

37 computational approach to edge detection", IEEE Trans. 

38 Pattern Analysis and Machine Intelligence, Vol. 8, pp. 
1 679-698, November, 1986.

2 STEP 404: The edges detected in step 400 are 

3 thinned and components thereof are connected using 

4 conventional techniques of connectivity analysis. The 

5 edges are thresholded so as to discard edges having too 

6 small a gradient. 

7 STEP 408: The edges detected in steps 400 and 

8 404 are compared pairwise so as to find strips, i.e.

9 pairs of parallel or almost parallel lines which are 

10 relatively long. If there are no such pairs, the method 

11 terminates. 

12 STEP 412: Find the spatial frequency spectrum 

13 within each strip and reject strips whose spatial 

14 frequency contents are incompatible with the spatial 

15 frequency band expected for advertisements. Typically, 

16 the rejection criterion is such that more than one 

17 strip, such as 3 or 4 strips, remain. 

18 STEP 416: Rank the remaining strips and 

19 select the highest ranking strip. The rank assigned to 

20 a strip depends on the probability that the strip 

21 includes advertisements. For example, the strip in the 

22 lowest location in the upper half of the field is given 

23 higher rank than strips above it, because the strips 

24 above it are more likely to be images of portions of 

25 the stadium. The lowest located strip is more likely to 

26 be the advertisements which are typically positioned 

27 below the stadium. 

28 Strips adjacent the bottom of the field are 

29 given low rank because the advertisements would only be 

30 imaged toward the bottom of the video field if the 

31 playing field is not being shown at all, which is 

32 unlikely. 
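
Steps 408 and 416 may be sketched as follows. The sketch assumes that each detected line has already been reduced to a vertical position, an angle and a length, with image rows numbered from the top of the field. The tuple layout, the function names and the numeric scoring rule are assumptions chosen to reproduce the ranking heuristics described above. They are not taken from Appendix A.

```python
from itertools import combinations

def find_strips(lines, min_length, max_angle_diff_deg):
    """Step 408: pair long, nearly parallel lines into candidate strips.

    lines: list of (y_center, angle_deg, length) tuples (assumed layout).
    Returns a list of (top_y, bottom_y) strips.
    """
    long_lines = [ln for ln in lines if ln[2] >= min_length]
    strips = []
    for a, b in combinations(long_lines, 2):
        if abs(a[1] - b[1]) <= max_angle_diff_deg:
            top, bottom = sorted((a, b))      # orders the pair by y_center
            strips.append((top[0], bottom[0]))
    return strips

def rank_strips(strips, field_height):
    """Step 416: order strips from most to least likely to hold advertisements."""
    def score(strip):
        center = (strip[0] + strip[1]) / 2.0
        if center <= field_height / 2.0:
            return center     # upper half: the lower the strip, the better
        return -center        # lower half: the nearer the bottom, the worse
    return sorted(strips, key=score, reverse=True)
```

The first element of the list returned by `rank_strips` corresponds to the highest ranking strip selected in step 416.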

33 Fig. 12 is a simplified flowchart of a 

34 preferred model matching method for performing the 

35 advertisement content identification step 330 of Figs. 

36 10A and 10B. Alternatively, advertisement content

37 identification may be provided by a user, as described 

38 above with reference to Fig. 7.

1 The method of Fig. 12 is preferably performed

2 in low resolution, as described above with reference to 

3 step 380 of Fig. 11. The method of Fig. 12 preferably 

4 includes the following steps: 

5 STEP 420: The following steps 424, 430, 436,

6 440, 444 and 452 are performed for each almost 

7 parallel strip identified in segmentation step 320 of 

8 Fig. 11. 

9 STEP 424: The distance and angle between the 

10 two lines of each strip is computed and the scale and 

11 approximate perspective at which the strip was imaged 

12 is determined therefrom. 

13 STEP 430: During set-up, each advertisement 

14 model is divided into a plurality of windows. Steps 

15 436, 440 and 444 are performed for each window of each 

16 advertisement model. For example, if there are 5 models 

17 each partitioned into 6 windows, this step is performed 

18 30 times. 

19 STEP 436: A one-dimensional similarity search 

20 is carried out for the suitably scaled current model 

21 window k, along the current almost parallel strip. 

22 Typically, a cross-correlation function may be computed 

23 for each pixel along the current strip. 

24 STEP 440: The cross-correlation function 

25 values obtained in step 436 are thresholded. For 

26 example, values exceeding 0.6 may be assigned the value 

27 1 (correlation) whereas values under 0.6 may be 

28 assigned the value 0 (no correlation). The 1's are

29 weighted, depending on the "significance" of their 

30 corresponding windows. The "significance" of each 

31 window is preferably determined during set-up such that 

32 windows containing more information are more 

33 "significant" than windows containing little 

34 information. 

35 STEP 444: At this stage, weighted thresholded 

36 cross-correlation function values have been computed 

37 which represent the results of matching the contents of 

38 each position along the strip (e.g. of each of a 
1 plurality of windows along the strip which are spaced

2 at a distance of a single pixel) to each window of each 

3 model advertisement known to occur within the strip. 

4 The weighted thresholded cross-correlation 

5 function values are accumulated per all windows 

6 composing a model sign or a model strip. 

7 STEP 452: A decision is made as to the 

8 approximate location of the sequence of advertising 

9 models, within the strip. It is appreciated that, once 

10 the location of one advertisement model has been 

11 determined, the locations of the other advertisement

12 models in the same sequence are also determined,

13 knowing the scale and approximate perspective of the

14 imaged strip. 
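
The one-dimensional search, thresholding and weighted accumulation of steps 436 to 444 may be illustrated as follows. For brevity the sketch treats the strip and each model window as a one-dimensional intensity profile, which is a simplification, and it uses a zero-mean normalized cross-correlation. The function names and the window layout are assumptions and are not taken from Appendix A.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0.0:          # a flat patch carries no information
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom

def match_model(strip, windows, weights, threshold=0.6):
    """Accumulate weighted, thresholded correlations for one model.

    strip:   intensity profile along the strip, already scaled
    windows: list of (offset_in_model, samples), one entry per model window
    weights: "significance" of each window, fixed during set-up
    Returns one accumulated score per candidate position of the model.
    """
    model_len = max(off + len(samples) for off, samples in windows)
    scores = []
    for pos in range(len(strip) - model_len + 1):
        total = 0.0
        for (off, samples), weight in zip(windows, weights):
            patch = strip[pos + off: pos + off + len(samples)]
            # Step 440: values above the threshold count as 1, then weighted.
            if normalized_correlation(patch, samples) > threshold:
                total += weight
        scores.append(total)
    return scores
```

Positions with a high accumulated score are candidates for the approximate location decided in step 452.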

15 Fig. 13 is a simplified flowchart of a

16 preferred method for performing the precise

17 localization step 340 of Figs. 10A and 10B. In Fig. 13,

18 the advertisement model which was approximately 

19 localized by the method of Fig. 12, is localized with 

20 subpixel accuracy. Accurate localization is typically 

21 performed only for new fields. For "consecutive" 

22 fields, the advertisement's location is preferably

23 measured by video tracking. 

24 The method of Fig. 13 preferably includes the 

25 following steps: 

26 STEP 460: From Fig. 12, the following 

27 information is available per advertisement detected: 

28 one location within the advertisement, such as one 

29 vertex thereof, the advertisement scale height in the 

30 image and its approximate perspective. This information

31 is employed to compute the four vertices of each 

32 detected advertisement. 

33 STEP 464: A perspective transformation is 

34 computed which describes how to "transform" the 

35 typically rectangular model into the detected 

36 advertisement area which is typically non-rectangular 

37 due to its pose relative to the imaging camera. 

38 STEP 468: The contents of each of a plurality 
1 of model tracking windows into which the model is divided

2 during set up, is mapped into the video field, using 

3 the perspective transformation computed in step 464. 

4 STEP 470: Steps 472 and 476 are performed for 

5 each of the model tracking windows. 

6 STEP 472: The current model tracking window 

7 is translated through a search area defined in the 

8 video field. For each position of the model tracking 

9 window within the search area, a similarity error 

10 function (like cross-correlation or absolute sum of 

11 differences) is computed. Typically, the model tracking 

12 window has 8 x 8 or 16 x 16 different positions within 

13 the search area. 

14 STEP 476: The minimum similarity error 

15 function for the current model tracking window is 

16 found. Preferably, the minimum is found at subpixel 

17 accuracy, e.g. by fitting a two-dimensional parabola to 

18 the similarity error function generated in step 472 and 

19 computing the minimum of the parabola. This minimum 

20 corresponds to the best position, at "subpixel 

21 accuracy", for the current model tracking window within 

22 the video field. 

23 If (STEP 480) the similarity error function 

24 minima are high for all tracking windows, i.e. none of 

25 the tracking windows can be well matched to the video 

26 field, then (STEP 482) processing of the current frame 

27 is terminated and the method of Fig. 10A, from step 320

28 onward, is performed on the following frame. 

29 STEP 484: Tracking windows which have a high 

30 similarity error function minimum are rejected. 

31 Typically, approximately 30 tracking windows remain. 

32 STEP 488 is a stopping criterion determining 

33 whether or not to perform another iteration of 

34 localization by matching tracking windows. Typically, 

35 if the tracking windows' centers are found to converge, 

36 relative to the centers identified in the last 

37 iteration, the process is terminated. Otherwise, the 

38 method returns to step 464. 

1 STEP 490: Once the tracking window locations

2 have converged, the perspective transformation between

3 the imaged advertisement and its model is recomputed.
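
The subpixel refinement of step 476 may be illustrated by the following sketch. Rather than fitting a full two-dimensional parabola, it fits a one-dimensional parabola separately along each axis through the three error values centred on the discrete minimum. This separable fit is a simplifying assumption. The function names are hypothetical and the code is not taken from Appendix A.

```python
def parabola_offset(e_minus, e_zero, e_plus):
    """Vertex offset, in samples, of the parabola through three equally
    spaced error values centred on the discrete minimum."""
    denom = e_minus - 2.0 * e_zero + e_plus
    if denom <= 0.0:          # flat or not a minimum: keep integer position
        return 0.0
    return 0.5 * (e_minus - e_plus) / denom

def subpixel_minimum(error):
    """error: 2-D list error[row][col] of similarity error values (step 472).
    Returns (row, col) of the minimum, refined to subpixel accuracy."""
    rows, cols = len(error), len(error[0])
    r, c = min(((i, j) for i in range(rows) for j in range(cols)),
               key=lambda rc: error[rc[0]][rc[1]])
    dr = dc = 0.0
    if 0 < r < rows - 1:      # refine only when both neighbours exist
        dr = parabola_offset(error[r - 1][c], error[r][c], error[r + 1][c])
    if 0 < c < cols - 1:
        dc = parabola_offset(error[r][c - 1], error[r][c], error[r][c + 1])
    return r + dr, c + dc
```

For an error surface that is exactly quadratic, the refined position coincides with the true minimum; for real image data it is only an estimate.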

4 Fig. 14 is a simplified flowchart of a 

5 preferred method for performing the tracking step 310 

6 of Figs. 10A and 10B. The method of Fig. 14 preferably

7 includes the following steps: 

8 STEP 492: A perspective transformation is 

9 performed on the model tracking windows and the 

10 contents thereof are mapped into the video field. This 

11 step employs the system's knowledge of the location of 

12 the advertisement in the previous field and, 

13 preferably, predicted scanning speed of the camera 

14 imaging the sports event. 

15 STEP 496: Steps 498 and 500, which may be 

16 similar to steps 472 and 476, respectively, of Fig. 13, 

17 are performed for each model tracking window. 

18 STEPS 508 AND 512 may be similar to steps 488 

19 and 490 of Fig. 13. 

20 STEP 510: If the window center locations do 

21 not yet converge, step 492 is redone; however, this

22 time, the texture mapping is based upon the perspective 

23 transformation of the previous iteration. 

24 STEP 520: The coefficients of the perspective 

25 transformation are preferably temporally smoothed, 

26 since, due to the smoothness of the camera's scanning 

27 action, it can be assumed that discontinuities are 

28 noise. 
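
The temporal smoothing of step 520 may be illustrated as follows. The sketch assumes that the perspective transformation is held as eight coefficients of the common projective form and that smoothing is a simple exponential blend between fields. Both the parameterization and the blending rule are assumptions, since the text above does not specify them.

```python
def apply_perspective(coeffs, x, y):
    """Map model point (x, y) into the video field using eight coefficients
    (a, b, c, d, e, f, g, h) of a projective transformation."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * x + h * y + 1.0
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w

def smooth_coefficients(previous, current, alpha=0.5):
    """Blend newly estimated coefficients with the smoothed coefficients of
    the previous field. alpha = 1.0 disables smoothing."""
    if previous is None:      # first field after a cut: nothing to blend
        return list(current)
    return [alpha * c + (1.0 - alpha) * p for p, c in zip(previous, current)]
```

A smaller `alpha` suppresses field-to-field jitter more strongly, at the cost of lagging behind genuine camera motion.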

29 Fig. 15 is a simplified flowchart of a 

30 preferred method for performing the occlusion analysis 

31 step 350 of Figs. 10A and 10B. The method of Fig. 15

32 preferably includes the following steps: 

33 STEP 530: The advertisement image in the video 

34 field is subtracted from its perspective transformed 

35 model, as computed in step 512 of Fig. 14 or, for a new 

36 field, in step 490 of Fig. 13.

37 STEP 534: Preferably, the identity of the 

38 advertisement image and the stored advertisement is 
1 verified by inspecting the difference values computed

2 in step 530. If the advertisement image and the stored 

3 advertisement are not identical, the current field is 

4 not processed any further. Instead, the next field is 

5 processed, starting from step 320 of Fig. 10B.

6 STEP 538: The internal edge effects are 

7 filtered out of the difference image computed in step 

8 530 since internal edges are assumed to be artifacts. 

9 STEP 542: Large non-black areas in the 

10 difference image are defined to be areas of occlusion. 

11 STEP 546: The occlusion map is preferably 

12 temporally smoothed since the process of occlusion may 

13 be assumed to be continuous. 
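
The core of the occlusion analysis, namely the difference image of step 530 and the selection of large non-black areas in step 542, may be sketched as follows. The sketch assumes grey-level images held as nested lists, a fixed difference threshold and 4-connected regions. The internal edge filtering of step 538 and the temporal smoothing of step 546 are omitted. The function name and parameters are hypothetical.

```python
from collections import deque

def occlusion_map(field, model, diff_threshold, min_area):
    """field, model: equal-size 2-D lists of grey levels, the model already
    perspective-transformed into the field. Returns a 2-D list of 0/1."""
    rows, cols = len(field), len(field[0])
    # Step 530: pixels whose difference is "non-black".
    candidate = [[abs(field[r][c] - model[r][c]) > diff_threshold
                  for c in range(cols)] for r in range(rows)]
    occluded = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not candidate[r0][c0] or seen[r0][c0]:
                continue
            # Collect one 4-connected region of differing pixels.
            region, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and candidate[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Step 542: only large areas are treated as occlusion.
            if len(region) >= min_area:
                for r, c in region:
                    occluded[r][c] = 1
    return occluded
```

Isolated differing pixels, which are more likely noise than an occluding object, are left unmarked.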

14 Fig. 16 is a simplified flowchart of a 

15 preferred method for performing the advertisement 

16 incorporation step 360 of Figs. 10A and 10B. The method

17 of Fig. 16 preferably includes the following steps: 

18 STEP 560: The resolution of the replacing 

19 advertisement model, i.e. the advertisement in memory, 

20 is adjusted to correspond to the resolution in which 

21 the advertisement to be replaced was imaged. Typically,

22 a single advertisement model is stored in several 

23 different resolutions. 

24 STEP 570: The replacing advertisement is 

25 transformed and texture mapped into the video field 

26 pose, using tri-linear interpolation methods. This step 

27 typically is based on the results of step 512 of Fig. 

28 14 or, for a new field, on the results of step 490 of

29 Fig. 13. 

30 STEP 580: Aliasing effects are eliminated. 

31 STEP 590: The replacing pixels are keyed in 

32 according to an occlusion map. The values of the 

33 replacing pixels may either completely replace the 

34 existing values, or may be combined with the existing 

35 values, as by a weighted average. For example, the 

36 second alternative may be used for edge pixels whereas 

37 the first alternative may be used for middle pixels. 
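
The keying of step 590 may be illustrated by the following sketch. It assumes grey-level images held as nested lists, an occlusion map of 0/1 values and a per-pixel weight map that is 1.0 for middle pixels and smaller for edge pixels. The function name and the weight map are assumptions introduced for the example.

```python
def key_in(existing, replacing, occlusion, alpha):
    """Combine the replacing advertisement with the existing field.

    existing, replacing: 2-D lists of pixel values of equal size
    occlusion: 2-D list, 1 where an object hides the advertisement
    alpha: 2-D list, weight of the replacing pixel (1.0 = full replacement)
    """
    out = []
    for ex_row, rp_row, oc_row, al_row in zip(existing, replacing,
                                              occlusion, alpha):
        out.append([
            # Occluded pixels keep the existing value; the others are
            # replaced outright or blended by a weighted average.
            ex if oc else a * rp + (1.0 - a) * ex
            for ex, rp, oc, a in zip(ex_row, rp_row, oc_row, al_row)
        ])
    return out
```

With a weight of 0.5 at an edge pixel, the output is the average of the existing and replacing values, which softens the boundary of the implanted image.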

38 Fig. 17 is a simplified block diagram of 
1 camera monitoring apparatus useful in conjunction with

2 a conventional TV camera and with the advertisement

3 site detection/incorporation apparatus of Fig. 7. If

4 the parallel processor and controller of Fig. 7 is as 

5 illustrated in Fig. 8, the apparatus of Fig. 17 is not 

6 required and instead, a conventional TV camera may be 

7 employed. However, in the alternative, the automatic 

8 detection and content identification features of the 

9 system may be eliminated, by eliminating unit 170 of 

10 Fig. 8. In this case, the apparatus of Fig. 17 is 

11 preferably provided in operative association with the 

12 TV camera at the stadium or playing field. 

13 The apparatus of Fig. 17 provides camera 

14 information, including the identity of the "on-air" 

15 camera, its lens zoom state and the direction of its 

16 FOV center. This information may be employed, in 

17 conjunction with known information as to the positions 

18 and contents of advertisements in the stadium, in order 

19 to detect, identify and even roughly track each 
20 advertisement.

21 The apparatus of Fig. 17 includes: 

22 (a) a plurality of conventional TV cameras 600 of 

23 which one is shown in Fig. 17; 

24 (b) for each camera 600, a camera FOV (field of

25 view) center direction measurement unit 610 at least a 

26 portion of which is typically mounted on the TV camera 

27 600 pedestal; 

28 (c) for each camera 600, a camera lens zoom state 

29 monitoring unit 620 which is typically mounted on the 

30 TV camera 600 pedestal. The monitoring unit 620

31 receives an output indication of the zoom state

32 directly from the zoom mechanism of the camera; 

33 (d) an "on-air" camera identification unit 630 

34 operative to identify the camera, from among the 

35 plurality of TV cameras 600, which is being broadcast. 

36 This information is typically available from the 

37 broadcasting system control unit which typically

38 receives manual input selecting an on-air camera from a

1 producer; and

2 (e) a camera information video mixer 640 

3 operative to mix the output of units 610, 620 and 630

4 onto the broadcast. Any suitable mixing may be 

5 employed, such as mixing onto the audio channel, mixing 

6 onto the time code, or mixing onto the video signal 

7 itself. 

8 The camera FOV direction measurement unit 610

9 may be implemented using any of the following methods, 

10 inter alia: 

11 a. On-camera NFM (North Finding Module) in 

12 conjunction with two inclinometers for measuring the 

13 two components of the local gravity vector angle with 

14 respect to the FOV center direction; 

15 b. GPS (Global Positioning System) based direction

16 measurement system; 

17 c. Triangulation: positioning two RF sources

18 at two known locations in the playing field or stadium 

19 and an RF receiver on the camera; 

20 d. an on-camera boresighted laser designator in 

21 combination with an off-camera position sensing 

22 detector operative to measure the direction of the beam 

23 spot generated by the laser designator.

24 Fig. 18 is a simplified flowchart of an 

25 optional method for processing the output of the 

26 occlusion analysis process of Fig. 15 in order to take 

27 into account images from at least one off-air camera. 

28 If the method of Fig. 18 is employed, a video 

29 compressor and mixer 700 are provided in operative 

30 association with the TV cameras which are imaging the 

31 event at the playing field or stadium, as shown in Fig. 

32 2. The output of the compressor and mixer 700, 

33 comprising compressed images of the playing field as 

34 imaged by all of the TV cameras other than the TV 

35 camera which is "on-air", blended with the broadcast 

36 signal, is broadcast to remote advertisement site 

37 detection/incorporation systems such as that 

38 illustrated in Fig. 7. The transmission provided by 

1 compressor and mixer 700 of Fig. 2 is first decoded and 

2 decompressed in step 710 of Fig. 18. 

3 STEP 720: Steps 730, 740 and 750 are repeated 

4 for each advertisement site imaged by the "on-air" 

5 camera. 

6 STEP 730: Although it is possible to employ 

7 information from more than one of the "off-air" 

8 cameras, preferably, only a single "off-air" camera is 

9 employed to process each advertisement site and the 

10 single "off-air" camera is selected in step 730. For 

11 example, if the apparatus of Fig. 17 is provided, the 

12 output of camera FOV direction measurement unit 610 for 

13 each "off-air" camera may be compared in order to 

14 identify the "off-air" camera whose FOV direction is 

15 maximally different from the FOV direction of the "on- 

16 air" camera. Alternatively, particularly if the appa- 

17 ratus of Fig. 17 is omitted, a single "off-air" camera 

18 may be selected by performing preliminary analysis on 

19 the images generated by each of the "off-air" cameras 

20 in order to select the most helpful "off-air" camera. 

21 For example, the images generated by each "off-air" 

22 camera may be matched to the stored representation of 

23 the advertisement currently being processed. Then, the 

24 actual image may be warped and then subtracted from the 

25 stored representation for each "off-air" camera in 

26 order to obtain an estimate of the occlusion area for 

27 that camera and that advertisement. The camera with the 

28 minimal occlusion area may then be selected. 
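
The two selection criteria of step 730 can be sketched as follows. This is a minimal illustration, assuming that each FOV center direction is available as a 3-D vector and that an occlusion-area estimate in pixels has already been computed for each "off-air" camera; the function names and data shapes are hypothetical and do not come from the text.

```python
import math


def angle_between(u, v):
    """Angle in radians between two 3-D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_angle)


def select_by_direction(on_air_dir, off_air_dirs):
    """Pick the off-air camera whose FOV direction differs most
    from that of the on-air camera (first criterion of step 730)."""
    return max(off_air_dirs,
               key=lambda cam: angle_between(on_air_dir, off_air_dirs[cam]))


def select_by_occlusion(occlusion_areas):
    """Pick the off-air camera with the smallest estimated occlusion
    area for the current advertisement (alternative criterion)."""
    return min(occlusion_areas, key=occlusion_areas.get)
```

For example, with an on-air direction of `(1, 0, 0)`, a camera pointing along `(-1, 0, 0)` would be preferred over one pointing along `(0, 1, 0)`.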

29 STEP 740: The advertisement image of the 

30 selected "off-air" camera is warped onto the 

31 advertisement site as imaged by the "on-air" camera. 

32 STEP 750: The warped "off-air" advertisement 

33 image is subtracted from the "on-air" image and the 

34 difference image is filtered in order to compute the 

35 boundary of the occluding object at pixel-level 

36 accuracy. 
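
Step 750 can be illustrated with the following sketch, which assumes grayscale images held as NumPy arrays of equal size, with the "off-air" advertisement image already warped as in step 740. The text does not specify the filter; the fixed threshold and the 3x3 majority filter used here are assumptions chosen only to show the idea.

```python
import numpy as np


def occlusion_boundary(on_air, warped_off_air, threshold=30):
    """Return (occlusion mask, boundary mask) as boolean arrays."""
    # Signed arithmetic avoids wrap-around when subtracting uint8 images.
    diff = np.abs(on_air.astype(np.int32) - warped_off_air.astype(np.int32))
    mask = diff > threshold

    # 3x3 majority filter: keep a pixel only if at least 5 of the 9
    # pixels in its neighbourhood exceed the threshold (removes speckle).
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant").astype(np.int32)
    votes = sum(padded[dy:dy + h, dx:dx + w]
                for dy in range(3) for dx in range(3))
    filtered = votes >= 5

    # Boundary pixels are occluded pixels with at least one
    # non-occluded 4-neighbour.
    p = np.pad(filtered, 1, mode="constant")
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    boundary = filtered & ~interior
    return filtered, boundary
```

In practice the threshold and filter would be tuned to the noise level of the broadcast and to the accuracy of the warp.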

37 According to a preferred embodiment of the 

38 present invention, the advertisement to be incorporated 

1 in a particular location in the playing field or other 

2 locale may vary over time. This variation may be in 

3 accordance with a predetermined schedule, or in 

4 accordance with an external input. For example, a 

5 speech recognition unit may be provided which is 

6 operative to recognize key words, such as the word 

7 "goal" or the word "overtime", on the audio channel 

8 accompanying the video input to the system. In this 

9 way, an advertisement may be scheduled to be 

10 incorporated at particular times, such as just after a 

11 goal or during overtime. 
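
The choice between a predetermined schedule and an external input can be sketched as below. This assumes that the speech recognition unit delivers a list of recognized words for the current interval and that the schedule is a list of (start, end, advertisement) entries in seconds; these data shapes, and the rule that a recognized key word takes priority over the schedule, are illustrative assumptions rather than details from the text.

```python
def choose_advertisement(time_s, recognized_words, schedule,
                         keyword_ads, default_ad):
    """Return the advertisement to incorporate at time `time_s`."""
    # External input: a recognized key word such as "goal" or "overtime".
    for word in recognized_words:
        ad = keyword_ads.get(word.lower())
        if ad is not None:
            return ad
    # Predetermined schedule: first entry whose interval contains time_s.
    for start, end, ad in schedule:
        if start <= time_s < end:
            return ad
    return default_ad
```

For instance, with `keyword_ads = {"goal": "ad_after_goal"}`, hearing the word "goal" selects `"ad_after_goal"` regardless of the scheduled entry for that time.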

12 In the present specification, the term 

13 "advertisement site" refers to a location into which an 

14 advertisement is to be incorporated. If an existing 

15 advertisement occupies the advertisement site, the new 

16 advertisement replaces the existing advertisement. 

17 However, the advertisement site need not be occupied by 

18 an existing advertisement. The term "occluded" 

19 refers to an advertisement site which is partially or 

20 completely concealed by an object, typically a moving 

21 object, in front of it. 

22 A particular feature of the present invention 

23 is that, when it is desired to track an advertisement 

24 site within a larger image, the entire image is not 

25 tracked, but rather only the advertisement site itself. 

26 Another particular feature is that "special" 

27 advertisements may be provided, such as moving, 

28 blinking or otherwise varying advertisements, video 

29 film advertisements, advertisements with changing 

30 backgrounds, and advertisements with digital effects. 

31 It is appreciated that the particular 

32 embodiment described in Appendix A is intended only to 

33 provide an extremely detailed disclosure of the present 

34 invention and is not intended to be limiting. 

35 The applicability of the apparatus and 

36 methods described above is not limited to the 

37 detection, tracking and replacement or enhancement of 

38 advertisements. The disclosed apparatus and methods 

1 may, for example, be used to detect and track moving 

2 objects of central interest, as shown in Fig. 19, such 

3 as focal athletes and such as balls, rackets, clubs and 

4 other sports equipment. The images of these moving 

5 objects may then be modified by adding a "trail" 

6 including an advertisement such as the logo of a 

7 manufacturer. 

8 It is appreciated that various features of 

9 the invention which are, for clarity, described in the 

10 contexts of separate embodiments may also be provided 

11 in combination in a single embodiment. Conversely, 

12 various features of the invention which are, for 

13 brevity, described in the context of a single 

14 embodiment may also be provided separately or in any 

15 suitable subcombination. 

16 It will be appreciated by those skilled in 

17 the art that the invention is not limited to what has 

18 been shown and described hereinabove. Rather, the scope 

19 of the invention is defined solely by the claims which 

20 follow: 